Validity of linear regression in method comparison studies: is it limited by the statistical model or the quality of the analytical input data?
Abstract
We compared the application of ordinary linear regression, Deming regression, standardized principal component analysis, and Passing-Bablok regression to real-life method comparison studies to investigate whether the statistical model of regression or the analytical input data have more influence on the validity of the regression estimates. We took measurements of serum potassium as an example for comparisons that cover a narrow data range and measurements of serum estradiol-17beta as an example for comparisons that cover a wide data range. We demonstrate that, in practice, it is not the statistical model but the quality of the analytical input data that is crucial for interpretation of method comparison studies. We show the usefulness of ordinary linear regression, in particular, because it gives a better estimate of the standard deviation of the residuals than the other procedures. The latter is important for distinguishing whether the observed spread across the regression line is caused by the analytical imprecision alone or whether sample-related effects also contribute. We further demonstrate the usefulness of linear correlation analysis as a first screening test for the validity of linear regression data. When ordinary linear regression (in combination with correlation analysis) gives poor estimates, we recommend investigating the analytical reason for the poor performance instead of assuming that other linear regression procedures add substantial value to the interpretation of the study. This investigation should address whether (a) the x and y data are linearly related; (b) the total analytical imprecision (s(a,tot)) is responsible for the poor correlation; (c) sample-related effects are present (standard deviation of the residuals >> s(a,tot)); (d) the samples are adequately distributed over the investigated range; and (e) the number of samples used for the comparison is adequate.
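The comparison described in the abstract can be illustrated with a minimal sketch. It fits ordinary linear regression and Deming regression to the same paired measurements, then compares the standard deviation of the residuals with a total analytical imprecision. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the parameter `delta` (ratio of y-error variance to x-error variance), the combination formula sqrt(s_y^2 + (slope * s_x)^2) for the total analytical imprecision, and the synthetic data and imprecision values in the demo.

```python
import math


def _moments(x, y):
    """Return n, means, and sample (co)variances of the paired data."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - my) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    return n, mx, my, sxx, syy, sxy


def ols(x, y):
    """Ordinary least squares of y on x.

    Returns slope, intercept, residual standard deviation, and Pearson r.
    """
    n, mx, my, sxx, syy, sxy = _moments(x, y)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s_res = math.sqrt(ss_res / (n - 2))  # n - 2 degrees of freedom
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, s_res, r


def deming(x, y, delta=1.0):
    """Deming regression; delta = var(y errors) / var(x errors) (assumed known).

    delta = 1 corresponds to orthogonal regression.
    """
    _, mx, my, sxx, syy, sxy = _moments(x, y)
    d = syy - delta * sxx
    slope = (d + math.sqrt(d * d + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)
    intercept = my - slope * mx
    return slope, intercept


def total_analytical_sd(s_x, s_y, slope):
    """Assumed combination of the two methods' analytical SDs (illustrative)."""
    return math.sqrt(s_y ** 2 + (slope * s_x) ** 2)


# Synthetic narrow-range data, invented for illustration only.
x = [3.6, 3.9, 4.1, 4.4, 4.7, 5.0]
y = [3.7, 3.8, 4.2, 4.4, 4.8, 5.1]

b_ols, a_ols, s_res, r = ols(x, y)
b_dem, a_dem = deming(x, y, delta=1.0)
s_a_tot = total_analytical_sd(s_x=0.05, s_y=0.05, slope=b_ols)  # assumed SDs

print(f"OLS:    slope={b_ols:.3f} intercept={a_ols:.3f} r={r:.4f}")
print(f"Deming: slope={b_dem:.3f} intercept={a_dem:.3f}")
print(f"residual SD={s_res:.3f}, assumed total analytical SD={s_a_tot:.3f}")
```

If the residual standard deviation is much larger than the total analytical imprecision, that would point to sample-related effects, which is check (c) in the abstract. A ratio close to 1 would suggest that analytical imprecision alone explains the scatter.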
Journal title:
- Clinical Chemistry
Volume 44, Issue 11
Pages -
Publication date: 1998